48 research outputs found

    Parallel object classification algorithms in images

    The contribution concerns the parallelization of object classification algorithms for a SIMD-type parallel machine. It is assumed that the gray-level values of image pixels are located in an orthogonal memory block on which a vector of one-bit processors operates in bit-serial, word-parallel mode. For this computer, the Bayes classification algorithm, the K-means algorithm, and the ensemble average classifier are described.

    Preface

    Preface

    Near-Region Modification of Total Pressure Fluctuations by a Normal Shock Wave in a Low-Density Hypersonic Wind Tunnel

    Scientific understanding of how a normal shock wave modifies turbulence at hypersonic speeds is lacking. The overarching research objective of this study was to characterize the effects of a hypersonic shock wave on the structure of locally homogeneous turbulence. The current study, believed to be the first hypersonic shock-turbulence interaction experiment conducted, examined the effect of a normal shock wave on the total pressure fluctuations in its near region in a low-density hypersonic wind tunnel. Measurements were obtained with a fast-response Pitot pressure probe traversing in the freestream direction. The tunnel freestream noise level was characterized and served as the inflow/upstream condition to the interaction with the normal shock, which was a Mach stem created by the prescribed Mach reflection of two oblique shock waves. Measurements were made downstream of the Mach stem, and the results (noise values, autocorrelation coefficient functions, integral scales, and power spectral density estimates) were compared with the freestream measurements. Overall, the amplification factors for the noise, time scales, and power spectral density estimates were higher for the lower Re/m condition (i.e., lower freestream noise) than for the higher Re/m condition (i.e., higher freestream noise). In addition, the amplification factors across the range of unit Reynolds numbers were higher at 4.4 mm downstream of the Mach stem than at 2.4 mm downstream, indicating that the turbulent structures perhaps took time to grow after crossing the shock wave. Amplification was observed to be greater at higher frequencies.

    Improving bottleneck features for Vietnamese large vocabulary continuous speech recognition system using deep neural networks

    In this paper, a pre-training method based on the denoising auto-encoder is investigated and shown to provide a good initialization for the bottleneck networks of a Vietnamese speech recognition system, resulting in better recognition performance than the previously reported base bottleneck features. The experiments are carried out on a dataset of speech from the Voice of Vietnam (VOV) channel. The results show that DBNF extraction for Vietnamese recognition decreases the relative word error rate by 14% and 39% compared to the base bottleneck features and the MFCC baseline, respectively.
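
    The relative reductions quoted above are ratios, not differences in percentage points. A minimal sketch of that arithmetic follows; the function name and the numeric values in the usage note are hypothetical and are not taken from the paper, which reports only the relative figures.

```python
def relative_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER.

    Both arguments must be in the same unit (both percentages or both fractions).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - improved_wer) / baseline_wer
```

    For example, under the assumed (illustrative) values of a 20.0% baseline WER and a 17.2% improved WER, `relative_wer_reduction(20.0, 17.2)` gives a relative reduction of about 0.14, although the absolute drop is only 2.8 points.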

    Real-Time Smile Detection using Deep Learning

    Real-time smile detection from facial images is useful in many real-world applications, such as automatic photo capture in mobile phone cameras or interactive distance learning. In this paper, we study different deep object detection architectures for solving the real-time smile detection problem. We then propose a combination of a lightweight convolutional neural network architecture (BKNet) with an efficient object detection framework (RetinaNet). The evaluation on two datasets (GENKI-4K, UCF Selfie) with a mid-range hardware device (GTX TITAN Black) shows that our proposed method improves both the accuracy and the inference time of the original RetinaNet, reaching real-time performance. In comparison with the state-of-the-art object detection framework (YOLO), our method has a higher inference time, but it still reaches real-time performance and obtains higher smile detection accuracy on both datasets.

    PEDESTRIAN ACTIVITY PREDICTION BASED ON SEMANTIC SEGMENTATION AND HYBRID OF MACHINES

    The article presents an advanced driver assistance system (ADAS) based on a situational recognition solution that provides alert levels in the context of actual traffic. The solution is a process in which a single image is segmented to detect the positions of pedestrians and to extract features of pedestrian posture in order to predict their actions. The main purpose of this process is to improve accuracy and provide warning levels, which supports autonomous vehicle navigation in avoiding collisions. The process of predicting the situation and issuing warning levels consists of two phases: (1) segmenting the image to locate pedestrians and other objects in the traffic environment, and (2) judging the situation according to the position and posture of pedestrians in traffic. The accuracy of the action prediction is 99.59% and the speed is 5 frames per second.

    DEVELOPMENT OF HIGH-PERFORMANCE AND LARGE-SCALE VIETNAMESE AUTOMATIC SPEECH RECOGNITION SYSTEMS

    Automatic Speech Recognition (ASR) systems convert human speech into the corresponding transcription automatically. They have a wide range of applications, such as controlling robots, call center analytics, and voice chatbots. Recent studies on ASR for English have achieved performance that surpasses human ability. These systems were trained on large amounts of training data and performed well in many environments. For Vietnamese, there have been many studies on improving the performance of existing ASR systems; however, many of them were conducted on small-scale data, which does not reflect realistic scenarios. Although the corpora used to train these systems were carefully designed to maintain phonetic balance, efforts to collect them at a large scale are still limited. Specifically, only a single Vietnamese accent was evaluated in existing works. In this paper, we first describe our efforts in collecting a large data set that covers all 3 major accents of Vietnam, from the Northern, Central, and Southern regions. Then, we detail our ASR system development procedure, which uses the collected data set and evaluates different model architectures to find the best structure for Vietnamese. In the VLSP 2018 challenge, our system achieved the best performance with 6.5% WER, and on our internal test set of more than 10 hours of speech collected in real environments, the system also performs well with 11% WER.
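
    The WER figures above refer to word error rate, conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition; it assumes whitespace tokenization and is not the scoring tool used in the VLSP 2018 challenge or by the authors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: words are separated by whitespace
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

    Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.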